Legacy:Replication Examples/Battle City
A study of replication
I recently finished and released a small mod called Battle City. The main reason for creating it was to practice different methods of replication and to see how they behaved. The mod covered replication of variables from server to client and from client to server, as well as replication of functions in both directions. I tried different approaches to each model, and some worked better than others.
Replication is a very abstract idea. It isn't just about sending data between the server and the client; it is about maintaining the integrity of the simulation between the server and the client. Let me elaborate on this a little.
Synchronous games need no replication code at all, because they are designed to run in lockstep: every machine duplicates everything that is done. While this means few problems during development (what works offline will work online), it does cause problems for the clients. If the machines fall out of sync due to packet loss or latency, the game is corrupted and must be abandoned. If the game does not recognize that it is out of sync, it can carry on without any notification, with interesting results for the players. Games must be started at the same time, with everyone ready, and no one else can join a game in progress. There are more problems than those listed here, but these are more than enough to frustrate players.
Asynchronous games usually use a server/client model, where one machine governs the others. The server holds the correct version of the game, and the clients run simulations of that game which the server periodically corrects. In this case, Unreal uses replication to periodically correct the clients or to allow events to occur. Replication is also used to prevent the cheating that can occur in asynchronous games, because fooling a server into believing that the client-side version of events is correct is somewhat 'easy' for those who know what they are doing.
Battle City used the forms of replication that come up most often. Weapon/shield switching involved the client asking the server to do something, passing a variable along with the request. The effects of weapons/shields were the server telling the clients that something had occurred. The money/inventory system covered the case where the server is the supreme governor of variables that clients have to obey (you couldn't buy more items if the server didn't think you had enough money). Lastly, there are the planning and design decisions that must be accounted for when you're working out how a replication system will work.
The first case was asking the server to do something, i.e. to switch to a different weapon type. In planning this, the basic model I used was:
- Client request
- Client preliminary checks
- Client sends function with passed variable
- Server receive Client function
- Server preliminary checks
- Server accepts changes
- Server makes changes
- Server notifies Client of changes
- Client receives changes
- Client adjusts Client side variables
- Complete
There was a reason for each of these steps. First, a client request is accepted. This is purely client-side code and the server does not need to take part in it, so there is no point in replicating or simulating this event. During the preliminary checks the client simply checks whether it is able to perform the requested function. While this check can be hacked, that isn't a problem, as the server will double-check. It is there to speed things up for legitimate players by stopping the request early if it couldn't succeed anyway. Once that is done, the server receives the function. The replication block used for this was:
replication
{
    // Pick either reliable or unreliable here.
    un/reliable if(Role < Role_Authority)
        ServerSendFunction;
}
This indicates that the client wishes to send something to the server. While this could also be a variable, I often find that a function carrying the variable as a parameter is faster, as it also lets the server act on the value immediately and continue the sequence of events. In this case, the client tells the server to run a function with a parameter indicating which weapon the client wants to switch to.
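As a rough illustration of the request half of this model, here is a minimal sketch. The class name BCTank, the variables WeaponType and NumWeaponTypes, and both function names are hypothetical and not taken from Battle City; the choice of reliable is also just an assumption made for the example.

```unrealscript
// Hypothetical tank pawn; every name here except the engine's own is illustrative.
class BCTank extends Pawn;

var int WeaponType;      // currently selected weapon (the server's copy is the one that counts)
var int NumWeaponTypes;  // number of weapon types available

replication
{
    // Functions the owning client may ask the server to run.
    reliable if (Role < ROLE_Authority)
        ServerSwitchWeapon;
}

// Runs on the client: cheap preliminary check, then ask the server.
simulated function RequestSwitchWeapon(int NewType)
{
    if (NewType < 0 || NewType >= NumWeaponTypes || NewType == WeaponType)
        return; // not worth sending

    ServerSwitchWeapon(NewType);
}

// Runs on the server: repeat the check against the server's own state,
// because the client-side check can be bypassed.
function ServerSwitchWeapon(int NewType)
{
    if (NewType < 0 || NewType >= NumWeaponTypes)
        return;

    WeaponType = NewType;
}

defaultproperties
{
    NumWeaponTypes=3
}
```

Note that WeaponType is not replicated in this sketch, so the client's preliminary check only sees its local copy; the server-side check is the one that matters.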
From there the server checks the variable against its own version of the client. The weapon type currently in use was stored in the vehicle itself, and the server checked the passed variable against its current version of the pawn. If the check failed, the server notified the client that the request was determined to be invalid. The model I used for this was:
replication
{
    un/reliable if(Role == Role_Authority)
        ClientFunction;
}

function ServerParseFunction(int A)
{
    if(!ParseVariable(A))
        ClientFunction();
    // Other server required changes
}

simulated function ClientFunction()
{
    if(Level.NetMode == NM_DedicatedServer)
        return;
    if(PlayerController(Controller) != None)
    {
        // Server/Client run function if possible.
    }
}
To my understanding, a server runs a function whether it is simulated or not. Simply marking functions as simulated everywhere will not necessarily make your code work. I believe the reason is that the simulated keyword only allows the function to run on the client (that is, if the actor has the appropriate Role setting). In most cases the function also needs parameter values from the server, since you usually want it to do something with something else. If the function call itself isn't replicated from the server to the client, the function will run with null values in place of those parameters, producing either no result or an undesired one. Hence the need to replicate the client-intended function from the server, which is what the example above does.
The if statement checking Level.NetMode is written to exclude dedicated servers, since they can never be clients. Remember that excluding Role == Role_Authority would also exclude listen servers, as they are simply a client attached to its own server.
In most cases, I wanted the client to update its HUD whenever a weapon/shield was switched to a new type. I used Level.TimeSeconds to time a lot of the HUD events, but simply passing Level.TimeSeconds from the server to the client will not work, because each machine's Level.TimeSeconds starts counting when its own engine was started. The difference between the server's Level.TimeSeconds and the client's can therefore be anything from negligible to extremely large.
However, since the HUD is only ever owned by the client (servers do not maintain their own copy of a client's HUD), the client can be responsible for notifying the HUD of the change that occurred. In the example above, I could also extend 'if(PlayerController(Controller) != None)' with '&& PlayerController(Controller).myHud != None' to further weed out servers/clients that no longer have a HUD (and to prevent an Accessed None from occurring).
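A minimal sketch of such a client-side notification follows, assuming a hypothetical pawn class BCTankNotify, a hypothetical function ClientWeaponSwitched and a hypothetical variable LastSwitchTime. It stamps the event with the client's own clock instead of a time sent by the server, which sidesteps the Level.TimeSeconds mismatch described above.

```unrealscript
// Hypothetical pawn subclass; names other than the engine's own are illustrative.
class BCTankNotify extends Pawn;

var float LastSwitchTime; // local Level.TimeSeconds when the notification arrived

replication
{
    // Functions the server may call on the owning client.
    reliable if (Role == ROLE_Authority)
        ClientWeaponSwitched;
}

simulated function ClientWeaponSwitched(int NewType)
{
    local PlayerController PC;

    // A dedicated server has no local player and therefore no HUD.
    if (Level.NetMode == NM_DedicatedServer)
        return;

    PC = PlayerController(Controller);
    if (PC == None || PC.myHUD == None)
        return; // avoids an Accessed None warning

    // Use this machine's own clock; the server's clock is not comparable.
    LastSwitchTime = Level.TimeSeconds;

    // A custom HUD class would be told about NewType here (not shown).
}
```

The HUD can then time its animation from LastSwitchTime, since both values come from the same machine.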
The client-side notification can be hacked, but there would be very little point in doing so, as it is the server that maintains the weapon/shield status, not the client. Thus, the client-side-only notification can be considered 'safe to hack'.
The second element was telling a client about a server event. This can range from simple (a tank shell hit you) to somewhat difficult (a tank shell hit you, applied a special effect to you and started a timer for its effects), so the level of planning increases accordingly. The model I took for this approach was:
- Server event occurred
- Server parses event to ensure validity
- Server generates reactions to events
- Server sends event to Client
- Client receives
- Client updates simulation
- Complete
All of the server side is pretty easy. However, when a tank had a special effect placed on it, a boolean was altered on the server to let the client know the effect was active. For example, bConcussed was set to true while the concussion effect was in play and to false when it wasn't. The server governs this variable and the client is normally never allowed to change it. The replication looked a little like this:
replication
{
    un/reliable if(bNetDirty && Role == Role_Authority)
        bConcussed;
}
This condition tells the server to replicate the variable again whenever its copy differs from the one the client has. This was used for a different purpose that I will explain later on.
When I first wrote this code, I simply called a simulated client function to spawn the client-side effects, which then attached themselves to the owner of the effect. The owner was then responsible for destroying the effect. The problem was that this took several replication steps: first replicating that the effect was needed, then the effect replicating its existence back to the server and its owner, and then the owner at some point replicating the function to destroy the effect. This wasn't the best approach. Not only was it bad replication, it was also poor at maintaining the integrity of the simulation between the server and the client: sometimes the effects wouldn't be removed, and sometimes they would appear incorrectly.
So what I decided to do was to have the server set a variable indicating the status of the effect. The client then checks this variable and maintains its effects client-side only. While the effects could be hacked, doing so gains nothing, as it would only hinder the cheater, who wouldn't know what effect was being applied to him/her.
Thus the replication code went something like this:
function TakeDamage(syntax) // 'syntax' stands in for the usual parameters
{
    if(Damaged())
    {
        bConcussed = true;
        UpdateClient();
    }
}

function Tick(float DeltaTime)
{
    if(Level.NetMode != NM_DedicatedServer)
    {
        if(bConcussed)
        {
            // generate effects
        }
        else if(Effects != None)
            Effects.Destroy();
    }
}
There was also extra code to maintain the effects, and both the owner and the effect were responsible for destroying the effect/itself. This was a two-pronged approach to removing rogue effects. In hindsight, this probably wasn't really required, nor did it need to be in the Tick function. Periodic checking would have been fine ... Battle City was reported to be slow on a lot of machines, and that was probably due to my overuse of the Tick function.
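The periodic checking mentioned above could look something like the following sketch, which swaps Tick for a repeating timer. This is not the code Battle City shipped with. The class name BCTankEffects, the variables ConcussionEffect and ConcussionEffectClass, and the 0.25 second interval are all assumptions, and the sketch assumes nothing else on the actor relies on Timer.

```unrealscript
// Hypothetical tank pawn; names other than the engine's own are illustrative.
class BCTankEffects extends Pawn;

var bool bConcussed;                    // changed by the server only
var Actor ConcussionEffect;             // client-side visual, never replicated
var class<Actor> ConcussionEffectClass; // hypothetical effect class, set in defaults

replication
{
    reliable if (bNetDirty && Role == ROLE_Authority)
        bConcussed;
}

simulated function PostBeginPlay()
{
    Super.PostBeginPlay();

    // Effects only matter on machines that actually render something.
    if (Level.NetMode != NM_DedicatedServer)
        SetTimer(0.25, true); // repeat four times a second instead of every frame
}

simulated function Timer()
{
    if (bConcussed)
    {
        // Spawn the effect once, the first time the flag is seen as true.
        if (ConcussionEffect == None && ConcussionEffectClass != None)
        {
            ConcussionEffect = Spawn(ConcussionEffectClass, self,, Location);
            if (ConcussionEffect != None)
                ConcussionEffect.SetBase(self);
        }
    }
    else if (ConcussionEffect != None)
    {
        ConcussionEffect.Destroy();
        ConcussionEffect = None;
    }
}

simulated function Destroyed()
{
    // Clean up so no rogue effect outlives its tank.
    if (ConcussionEffect != None)
        ConcussionEffect.Destroy();

    Super.Destroyed();
}
```

The trade-off is a delay of up to one timer interval before an effect appears or disappears, in exchange for far fewer checks than running them every frame.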
Finally, the cash system was a mixture of both of the above. While I won't go into elaborate detail, it simply used the parts where the client requested an event (then came the client preliminary checks, and so on), with the end result that the server would give an item to the player. By using a mixture of the above techniques I was able to make a rudimentary monetary system pretty quickly and easily.
What I learnt most from this mod was how to plan and set up for replication. First of all you need to figure out exactly what you are doing. Say you want to make items purchasable; one possible plan is this:
- Client chooses item to buy
- Client checks if it can buy said item
- Client notifies the Server of the purchase decision
- Server checks if the Server version of the Client can buy said item
- Server confirms the purchase is valid
- Server then notifies the Client the purchase was successful/unsuccessful
- Client receives the Server notification
- Client informs the player
- Complete
By breaking it down into steps, you should be able to visualize which parts need to run on the server and which parts need to run on the client. From here, you can plan out your function path tree: how functions will be run and in what order. With this much planning you can also see how you might optimize your code, and decide which sections have to be replicated reliably and which you can be more relaxed about (unreliable replication). By thinking about the planning and setup first, the things I coded worked the first time in both offline and online conditions 95% of the time. Of course, I can't deny that even with good planning and setup there is a chance that something will not work on the network, but doing this gives you a much higher chance that it will.
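To show how the purchase plan maps onto functions, here is a minimal sketch, assuming a hypothetical PlayerController subclass BCPlayer with a Cash variable and a made-up price rule in GetItemCost. None of these names come from Battle City, and actually handing over the item is left out.

```unrealscript
// Hypothetical shop logic; every name except the engine's own is illustrative.
class BCPlayer extends PlayerController;

var int Cash; // server decides the value; replicated to the owner for display and pre-checks

replication
{
    // Server to owning client: the authoritative cash value.
    reliable if (bNetOwner && Role == ROLE_Authority)
        Cash;

    // Client to server: the purchase request.
    reliable if (Role < ROLE_Authority)
        ServerBuyItem;

    // Server to client: the outcome.
    reliable if (Role == ROLE_Authority)
        ClientBuyResult;
}

// Made-up price rule, purely for the example.
simulated function int GetItemCost(int ItemIndex)
{
    return 100 * (ItemIndex + 1);
}

// Client chooses an item, runs the preliminary check, then notifies the server.
exec function BuyItem(int ItemIndex)
{
    if (ItemIndex < 0 || Cash < GetItemCost(ItemIndex))
    {
        ClientMessage("Not enough cash");
        return;
    }

    ServerBuyItem(ItemIndex);
}

// Server repeats the check on its own copy, applies the change and reports back.
function ServerBuyItem(int ItemIndex)
{
    local bool bSuccess;

    if (ItemIndex >= 0 && Cash >= GetItemCost(ItemIndex))
    {
        Cash -= GetItemCost(ItemIndex);
        // Giving the item to the player would happen here (not shown).
        bSuccess = true;
    }

    ClientBuyResult(ItemIndex, bSuccess);
}

// Client receives the server's verdict and informs the player.
function ClientBuyResult(int ItemIndex, bool bSuccess)
{
    if (bSuccess)
        ClientMessage("Purchase complete");
    else
        ClientMessage("Purchase refused by the server");
}
```

Each function corresponds to one or two steps of the plan, which makes it easy to see that only the request and the result cross the network, while both checks run locally on whichever machine performs them.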
Remember, as elmuerte has reminded me again and again, designing, planning and thinking about how you will write your code is more important than the task of writing the code itself. In fact, I think that the sign of a good coder is being able to visualize and explain to others how the code will be written.
Well, with all this said, I hope this slight revelation will help you in your adventures with replication. Replication is definitely something that is hard to grasp and I am still learning more about it (you can never learn enough about replication). I did learn some other replication techniques during the development of Battle City, but I can't give everything away now, can I?
Comments
Solid Snake: Thanks to Arelius for fixing up some of my comments in this document. Much appreciated.
JonAzz: In the replication blocks, shouldn't it be Reliable if ( Role == Role_Authority ) {} , right now the Reliable is missing :B
Solid Snake: You're correct that it should have a pre-syntax keyword before the if statement, but I left it empty so that people would decide for themselves whether to use reliable or unreliable (although mentioning that in the post might have helped). The curly brackets aren't required, but some people use them for obfuscation reasons. You're allowed to do:
if(syntax) Blah blah;
or
if(Syntax) { Blah Blah; }
Tarquin: I suggest moving this page to Replication Examples/Battle City:
Vitaloverdose: I'm not sure what the 'un/reliable' indicates; I haven't seen it written like that anywhere else. Is it reliable or unreliable? It's not explained anywhere in the tutorial.
Kartoshka: The "reliable" function tells the server to check to see if the packet containing the replicated variable was sent successfully or not, and keep sending it until it is successful. The "unreliable" function sends the packet, but does not perform this check. As this is just a general example of replication, I believe the author left it up to the end-user to determine which function he would like to use.